Add TRN hybrid non-record submission (1.4942 bpb, 1x RTX 5090) by amabito · Pull Request #669 · openai/parameter-golf

amabito · 2026-03-25T02:15:44Z

Non-record submission: oscillatory recurrence + attention hybrid under the 16 MB constraint.

What this is

A 10-layer hybrid model (7 TRN layers + 3 attention layers) with int5 QAT and
zstd-22 compression. The TRN layers use a Kogge-Stone parallel prefix scan over
complex-valued oscillators -- no Triton, no custom CUDA, pure PyTorch.
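
For readers unfamiliar with the scan, here is a minimal sketch of a Kogge-Stone prefix scan applied to a complex linear recurrence h_t = a_t * h_{t-1} + b_t. It is not the code from train_gpt_trn.py. The recurrence form, the function names, and the use of plain Python lists instead of PyTorch tensors are assumptions made for illustration. In a tensor implementation, each inner loop would presumably become one shifted elementwise operation.

```python
import cmath

def sequential_scan(a, b):
    """Reference: h_t = a_t * h_{t-1} + b_t with h_{-1} = 0, one step at a time."""
    h, out = 0j, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out

def kogge_stone_scan(a, b):
    """Same recurrence computed in ceil(log2(T)) combine steps.

    Each position holds an affine map h -> A*h + B covering a window of
    inputs. Composing the map at t with the map at t - d doubles the window.
    """
    A, B = list(a), list(b)
    n = len(A)
    d = 1
    while d < n:
        # All updates in a step read only the previous step's values,
        # so they are independent and could run in parallel.
        new_A, new_B = A[:], B[:]
        for t in range(d, n):
            new_A[t] = A[t] * A[t - d]
            new_B[t] = A[t] * B[t - d] + B[t]
        A, B = new_A, new_B
        d *= 2
    return B

# Hypothetical oscillator: decay 0.95 per step, position-dependent rotation.
T = 16
a = [0.95 * cmath.exp(1j * 0.3 * (k + 1) / T) for k in range(T)]
b = [complex(k % 3, 0.5) for k in range(T)]
max_diff = max(abs(x - y) for x, y in zip(kogge_stone_scan(a, b), sequential_scan(a, b)))
print("max |parallel - sequential| =", max_diff)
```

The scan does O(T log T) work instead of O(T), but needs only O(log T) dependent steps, which is what makes it expressible with ordinary tensor ops.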

Score: 1.4942 bpb (int5 roundtrip, 636 steps / 600s wallclock, 1x RTX 5090).
Artifact: 15.28 MB.

What went wrong

The model reaches 1.26 bpb in fp32 at 20K steps, but int5 quantization degrades
it to 1.93 bpb. The oscillator projection weights (d_model -> 6K, encoding
frequency and phase) accumulate O(t) phase drift from quantization errors.
At 1000 steps the int5 penalty is small (+0.041 bpb); at 20K steps the model
collapses (+0.669 bpb).
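
The O(t) drift can be illustrated with a toy calculation. It is a sketch under stated assumptions, not the analysis from the README. It assumes t is the number of recurrence steps over which an oscillator runs, and it assumes symmetric per-tensor 5-bit rounding with integer levels in [-16, 15]. The actual quantization scheme is not described in this PR text, and all names and values below are hypothetical. A frequency off by delta rotates the state by an extra delta radians per step, so the phase error after t steps is delta * t.

```python
import cmath

def quantize_int5(values):
    """Assumed scheme: symmetric per-tensor rounding to integer levels in [-16, 15]."""
    scale = max(abs(v) for v in values) / 15.0 or 1.0  # guard against all-zero input
    return [max(-16, min(15, round(v / scale))) * scale for v in values]

def phase_error(omega, omega_q, t):
    """Phase difference (radians) after t steps: linear in t."""
    return abs(omega_q - omega) * t

def state_error(omega, omega_q, t):
    """Distance between the exact and quantized unit-magnitude oscillator states."""
    return abs(cmath.exp(1j * omega * t) - cmath.exp(1j * omega_q * t))

# Hypothetical per-oscillator frequencies in radians per step.
omegas = [0.013, 0.21, 0.5, 1.0]
omegas_q = quantize_int5(omegas)

for t in (10, 100, 1000):
    drift = [phase_error(w, wq, t) for w, wq in zip(omegas, omegas_q)]
    print(t, [f"{x:.3f}" for x in drift])
```

Once delta * t approaches pi, the quantized oscillator is out of phase with the trained one, and the state error reaches its maximum of 2 for a unit-magnitude state.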

A parameter-matched 13L Transformer shows only +0.016 bpb of int5 degradation at
the same step count, so the failure is specific to the oscillatory recurrence
parameters.

What is included

records/track_non_record_16mb/2026-03-25_TRN_Hybrid_Int5_1x5090/
- README.md (architecture, ablations, quantization analysis, 13L comparison)
- submission.json
- train.log
- train_gpt_trn.py (self-contained, zero external dependencies)
- Root README.md table updated (Non-Record Runs)

What is not included

- 3-seed runs (single seed only)
- 8xH100 results (tested on 1x RTX 5090 only)

Oscillatory recurrence + attention hybrid under 16 MB constraint. 10 layers (7 TRN + 3 Attn), int5 QAT, Kogge-Stone parallel scan. Int5 collapses at 20K steps due to oscillator projection phase drift.

Add TRN hybrid non-record submission (1.4942 bpb, 1x RTX 5090)

133b0c1

amabito wants to merge 1 commit into openai:main from amabito:trn-hybrid-submission
